Abstract
Background: Idiopathic multicentric Castleman disease (iMCD) is a rare disorder characterized by multifocal lymphadenopathy and systemic inflammation that can affect multiple organ systems through a poorly understood, cytokine storm-driven pathology. Diagnosis requires the exclusion of mimetics such as lymphoma, satisfaction of both iMCD major criteria (characteristic lymph node (LN) histopathology and multicentric lymphadenopathy), and at least 2 of 11 minor laboratory and clinical criteria. iMCD is further divided into three main clinical subtypes: TAFRO (Thrombocytopenia, Anasarca, Fever, Renal dysfunction/Reticulin fibrosis, Organomegaly), IPL (Idiopathic plasmacytic lymphadenopathy), and NOS (Not Otherwise Specified). In this study, we leverage the diagnostic and clinicopathologic review, by our team of expert pathologists and clinicians, of nearly 400 cases of iMCD and iMCD-like mimetics to create a machine learning model that determines which LN features, minor clinical symptoms, and lab-based criteria are most predictive of an iMCD diagnosis and subtype classification.
Methods: A total of 392 patients were diagnosed and subtyped by our Certification and Access Subcommittee. For each diagnosis, pathologists gave consensus ratings of five LN features from an LN biopsy: atrophic germinal centers (GCs), hyperplastic GCs, vascularity, plasmacytosis, and follicular dendritic cell (FDC) presence. LN features coupled with patient laboratory and clinical data are used to determine an iMCD diagnosis. The lab-based and clinical criteria were inflammation markers (fever, elevated C-reactive protein, elevated erythrocyte sedimentation rate), anemia, thrombocytopenia, hypoalbuminemia, renal dysfunction, polyclonal hypergammaglobulinemia, constitutional symptoms, organomegaly, fluid retention, characteristic skin lesions, and lymphocytic interstitial pneumonitis. We also included megakaryocyte hyperplasia and bone marrow fibrosis. We used these LN, clinical, and lab-based criteria to create an XGBoost model to determine which features were most predictive of an iMCD diagnosis or subtype classification.
Results: We used data from patients with or without panel-confirmed iMCD to train an XGBoost model on 18 predictors drawn from LN histopathology ratings and binary symptom indicators. Five-fold cross-validation produced a mean out-of-fold ROC-AUC of 0.86, indicating good discrimination on held-out cases. The features strongly associated (|SHAP| > 0.5) with iMCD were elevated inflammation markers and thrombocytopenia. Conversely, hyperplastic GCs were the strongest feature negatively associated with iMCD.
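As a reading aid, the mean out-of-fold ROC-AUC can be sketched in plain Python. This is not the study's pipeline: the labels and scores are toy values invented for illustration, no XGBoost model is fitted, and the helper names `roc_auc` and `mean_fold_auc` are hypothetical. AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores contribute 0.5 to the count.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("each fold needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_fold_auc(folds):
    """Average per-fold AUCs; each fold is (held_out_labels, held_out_scores)."""
    return mean(roc_auc(y, s) for y, s in folds)


# Toy held-out predictions for two folds (invented; 1 = iMCD, 0 = mimetic).
toy_folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    ([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.3]),
]
fold_aucs = [roc_auc(y, s) for y, s in toy_folds]
overall = mean_fold_auc(toy_folds)
```

In a five-fold setting, `toy_folds` would hold five such pairs, one per held-out fold, with scores produced by a model trained on the remaining four folds.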
For subtype discrimination within the iMCD cohort, we trained a multiclass XGBoost model on the same feature set, achieving a macro one-vs-all ROC-AUC of 0.83 under identical five-fold cross-validation. The features strongly associated (|SHAP| > 0.5) with TAFRO were thrombocytopenia, renal dysfunction, and megakaryocyte hyperplasia. The strongest features associated with IPL were the presence of polyclonal hypergammaglobulinemia and the absence of thrombocytopenia. The strongest features associated with NOS were the presence of excess vascularity in the LN, absence of thrombocytopenia, absence of polyclonal hypergammaglobulinemia, and absence of renal dysfunction.
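The macro one-vs-all ROC-AUC used for the subtype model can be illustrated with a minimal sketch, assuming the model outputs one probability per subtype for each patient. The probabilities below are invented toy values, not study data, and `binary_auc` and `macro_ovr_auc` are hypothetical helper names. Each subtype is scored in turn against the other two, and the three AUCs are averaged with equal weight.

```python
def binary_auc(labels, scores):
    """Pairwise-ranking ROC-AUC for 0/1 labels; tied scores count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(true_classes, class_probs, classes):
    """Unweighted mean of one-vs-all AUCs, one per class.

    class_probs[i][k] is the predicted probability that case i belongs to classes[k].
    """
    aucs = []
    for k, cls in enumerate(classes):
        labels = [1 if t == cls else 0 for t in true_classes]
        scores = [row[k] for row in class_probs]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs), aucs


subtypes = ["TAFRO", "IPL", "NOS"]
# Toy panel-assigned subtypes and predicted probabilities (invented values).
truth = ["TAFRO", "TAFRO", "IPL", "IPL", "NOS", "NOS"]
probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.1, 0.4],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
    [0.6, 0.1, 0.3],
]
macro_auc, per_class = macro_ovr_auc(truth, probs, subtypes)
```

Because the average is unweighted, a rare subtype influences the macro score as much as a common one, which matters when subtype counts are imbalanced.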
Conclusions: Here, we used quantitative scoring of LN histopathology and diagnostic criteria, coupled with XGBoost modeling, to accurately discriminate iMCD from mimetic disorders and further resolve the three clinical subtypes. These results suggest that a machine-learned panel of LN features and routinely measured laboratory and clinical features can provide an objective decision support tool for pathologists and clinicians, potentially shortening diagnostic delays, guiding early subtype-tailored management, and serving as a reproducible benchmark for prospective validation in multi-center cohorts.